98¢/MFlop, Ultra-Large-Scale Neural-Network Training on a PIII Cluster
Author
Abstract
Artificial neural networks with millions of adjustable parameters and a similar number of training examples are a potential solution for difficult, large-scale pattern recognition problems in areas such as speech and face recognition, classification of large volumes of web data, and finance. The bottleneck is that neural network training involves iterative gradient descent and is extremely computationally intensive. In this paper we present a technique for distributed training of Ultra-Large-Scale Neural Networks (ULSNN) on Bunyip, a Linux-based cluster of 196 Pentium III processors. To illustrate ULSNN training we describe a preliminary experiment in which a neural network with 1.8 million adjustable parameters is being trained to recognize machine-printed Japanese characters from a database containing 6 million training patterns. The simulation is still underway, with an average performance during the first 56 hours of operation (the elapsed time in the simulation prior to this paper's submission) of 152 GFlops (single precision). With a machine cost of $149,500, this yields a price/performance ratio of 98¢/MFlop (single precision). For comparison purposes, training using double precision and the ATLAS DGEMM produces a sustained performance of 70 GFlops, or $2.13/MFlop (double precision).
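As a quick sanity check, the quoted price/performance ratios follow directly from the machine cost divided by sustained throughput. The sketch below uses only the figures stated in the abstract; the constant names are illustrative, and small rounding differences from the quoted cents figures are expected:

```python
# Sanity check of the abstract's price/performance arithmetic.
# All figures are taken from the abstract itself.

MACHINE_COST_USD = 149_500       # cost of the Bunyip cluster
SUSTAINED_SP_MFLOPS = 152_000    # 152 GFlops sustained, single precision
SUSTAINED_DP_MFLOPS = 70_000     # 70 GFlops sustained, double precision (ATLAS DGEMM)

def price_per_mflop(cost_usd: float, mflops: float) -> float:
    """Dollars per sustained MFlop."""
    return cost_usd / mflops

sp = price_per_mflop(MACHINE_COST_USD, SUSTAINED_SP_MFLOPS)
dp = price_per_mflop(MACHINE_COST_USD, SUSTAINED_DP_MFLOPS)
print(f"single precision: ${sp:.2f}/MFlop")  # ~ $0.98/MFlop, i.e. 98 cents
print(f"double precision: ${dp:.2f}/MFlop")  # close to the quoted $2.13/MFlop
```

The double-precision figure comes out near $2.14/MFlop with these rounded inputs; the abstract's $2.13/MFlop presumably reflects the unrounded measured throughput.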
Publication date: 2000